The process of consensus voting, or decision making by unanimous agreement, has many
distinct advantages: it fosters discussion and participation, empowers minorities and independent
thinkers, and is more likely, after a decision has been made, to secure the participants'
support for the chosen course of action. These considerations, among others, have led many
institutions to adopt consensus voting as a practical method of decision making.

The disadvantage of consensus decision making is, of course, the difficulty of reaching
consensus. While this challenge is largely overcome in many theoretical settings such as Aumann's
"agree to disagree" result and its related literature, a hitherto unsolved difficulty is the lack of
a framework offering rational (i.e., Bayesian) consensus decision making that can be performed
using simple and efficient calculations.

We study a stochastic model featuring a finite group of agents that have to choose between
one of two courses of action. Each member of the group has a private and independent signal at
his or her disposal, giving some indication as to which action is optimal. To come to a common
decision, the participants perform repeated rounds of voting. In each round, each agent casts
a vote in favor of one of the two courses of action, reflecting his or her current conditional
probabilities, and observes the votes of the rest in order to calculate an updated conditional
probability.

We prove four results:
1. Consensus is always reached.

2. Each round of voting improves the aggregation of information.

3. The chance of a correct decision quickly approaches one as the number of agents increases.
This is achieved already at the second round of voting.

4. Most importantly, we provide an efficient algorithm for the calculation the agents have to
perform.
1 Introduction

Consensus voting, or decision by unanimous agreement, is a method of communal governance that
requires all members of a group to agree on a chosen course of action. The European Union's Treaty
of Lisbon [5] decrees that "Except where the Treaties provide otherwise, decisions of the European
Council shall be taken by consensus." In this the EU follows the historical example of the Diet of
the Hanseatic League [11] and others.

Proponents of this method consider it to have many advantages over majority voting: it
cultivates discussion, participation and responsibility, and avoids the so-called "tyranny of the
majority". The drawback is, of course, a lengthy and difficult decision making process, lacking
even the guarantee of a conclusive ending.

However, in standard theoretical setups of rational Bayesian participants (e.g. [10], [6]), agents
cannot "agree to disagree" [1], and consensus is eventually reached. Unfortunately, this may come at
the price of tractability; Bayesian calculations can, in some situations, be practically impossible [6].

Indeed, an inherent conflict between rationality and tractability has been a driving force in
behavioral economics since its foundation [16]. It seems that in most situations it is practically
impossible to calculate which course of action is optimal, leaving the theoretician with a model 
that is either not rational, and thus hard to justify, or not tractable and hence not realistic. 

We consider a model describing a group of Bayesian agents that have to make a binary decision. 
We show that under the dynamics we describe, unanimity is reached with probability one, and give 
a simple algorithm for the agents' calculations. 

Our model features a finite group of Bayesian agents that have to choose between two possible 
courses of action. Each initially receives a private and independent signal, which contains some 
information indicating which action is more likely to be the correct one. The agents participate in 
rounds of voting, in which each indicates which action it thinks is more likely to be correct, and 
learns the others' opinion thereof. The process continues until unanimity is reached. The Bayesian 
agents are myopic, so their actions are not strategic, but truthfully reflect the information available 
to them. They are rational in the sense that they do not follow heuristics or rules of thumb, but 
choose the vote (or action) that is optimal, given what they know. 

As an example, consider a committee that has to decide whether or not to accept a candidate
for a position, who a priori has a chance of one half to be a good hire. Each committee member gets
to interview the candidate in private. If the candidate is good (i.e., the correct action is "hire"),
then each committee member $i$ receives a private signal $A_i$ drawn independently from $N(1, 1)$, the
normal distribution with expectation 1 and variance 1. If the candidate is bad (i.e., "don't hire"),
then $A_i$ is drawn from $N(-1, 1)$. One can think of the private signal $A_i$ as quantifying the total
impression the candidate made on committee member $i$ during the interview.

The committee now commences to vote in rounds. In each round of voting each member casts 
a public "hire" or "don't hire" vote, depending on which it thinks is more likely to be the correct 
decision, according to the appropriate Bayesian calculation. Each member also observes the votes 
of the other members. As mentioned above, we show that the committee members will eventually 
all cast the same vote, and that the calculations they need to perform in order to calculate their 
votes are simple. 
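
For the committee example above, the first-round calculation has a closed form: the ratio of the $N(1,1)$ density to the $N(-1,1)$ density at $a$ is $e^{2a}$, so the log-likelihood ratio of a signal is $2a$, and with a uniform prior the first vote is "hire" exactly when the signal is positive. The following is a minimal sketch of that step; the function names are ours and do not appear in the paper.

```python
import random

def log_likelihood_ratio(a):
    """log of the N(1,1) density over the N(-1,1) density at a (equals 2a)."""
    log_p_good = -0.5 * (a - 1.0) ** 2   # normalizing constants cancel
    log_p_bad = -0.5 * (a + 1.0) ** 2
    return log_p_good - log_p_bad

def first_round_vote(a):
    """With a uniform prior, P[good | a] > 1/2 iff the log-likelihood ratio is positive."""
    return 1 if log_likelihood_ratio(a) > 0 else 0

def sample_signals(state, n, rng):
    """Draw n private signals: N(1,1) if state == 1 (good), N(-1,1) if state == 0."""
    mean = 1.0 if state == 1 else -1.0
    return [rng.gauss(mean, 1.0) for _ in range(n)]
```

Later rounds are not this simple, since each vote then also depends on what the earlier votes reveal about the other members' signals.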

One may, of course, raise the objection that the process could be made simpler if the committee
members were to tell each other their private signals, in which case the optimal answer would be
arrived at immediately. However, a common assumption in the study of Bayesian agents (cf. [2], [3],
[17], [6]) is that "actions speak louder than words", so that agents learn from each others' actions
rather than revealing to each other all their information. The latter option may not be feasible,
as the said information may consist of experiences and impressions that could take too long to
explain, may be difficult to articulate, or may not even be consciously known.

1.1 Related work 

Studies in committee mechanism design (cf. [7]) strive to construct mechanisms for eliciting 
information out of committee members to arrive at optimal results. We, by and large, do not take 
this path but consider a "natural" setting which was not specifically constructed to achieve any 
such goal. 

Our work is more closely related to models of herd behavior (cf. [2], [3], [17]). These feature a group
of agents with a state of the world and corresponding private signals, much like ours. However, 
there the agents are exogenously ordered, and each takes a single action after seeing - and learning 
from - the actions of its predecessors. There too agents don't observe each others' private signals 
but only actions. 

In the model of Gale and Kariv [6] agents act repeatedly, but each agent observes the actions
of only some of the other agents. Also, a much more general structure of the state of the world and 
private signals is used. Hence our model is a special case of the model they study. 

1.2 Model 

Our model features a finite set of agents $[n] = \{1, 2, \ldots, n\}$ that have to make a binary decision
regarding an unknown state of the world $S \in \{0, 1\}$. Each is initially given a private signal $A_i$,
distributed $\mu_0$ if $S = 0$ and $\mu_1$ if $S = 1$, and independent of the other signals, conditioned on $S$.

Definition 1.1. Let $\mu_0$ and $\mu_1$ be measures on a $\sigma$-algebra $(\Omega, \mathcal{O})$ satisfying the following conditions:

1. The Radon-Nikodym derivative $\frac{d\mu_1}{d\mu_0}(\omega)$ exists and is non-zero for all $\omega \in \Omega$.

2. Let $A$ be distributed $\frac{1}{2}\mu_0 + \frac{1}{2}\mu_1$, and let $X = \log\frac{d\mu_1}{d\mu_0}(A)$. Then the distribution of $X$ is
non-atomic.

Note that condition 2 implies that $\mu_0 \neq \mu_1$, since otherwise $X = 0$ a.s. and thus its distribution is
atomic.

Definition 1.2. Let $\mu_0$ and $\mu_1$ be measures on a $\sigma$-algebra $(\Omega, \mathcal{O})$. Let $\delta_0, \delta_1$ denote the measures
on $\{0, 1\}$ that satisfy $\delta_0(0) = \delta_1(1) = 1$ and $\delta_0(1) = \delta_1(0) = 0$.

Consider the probability space $\{0, 1\} \times \Omega^n$. Given an element $(S, A_1, \ldots, A_n) \in \{0, 1\} \times \Omega^n$ we
call $S$ the state of the world and call $A_i$ agent $i$'s private signal.

Let $\mathbb{P}$ be the following measure on the probability space $\{0, 1\} \times \Omega^n$:

\[ \mathbb{P} = \tfrac{1}{2}\,\delta_0 \otimes \mu_0^{\otimes n} + \tfrac{1}{2}\,\delta_1 \otimes \mu_1^{\otimes n}. \tag{1} \]

Equivalently, $S$ is picked uniformly from $\{0, 1\}$, and conditioned on $S$, the agents' private signals
$A_i$ are distributed i.i.d. $\mu_S$. It is important to note that conditioned on $S$ the private signals $A_i$
are independent.

The agents participate in a process of voting rounds. In each round $t$ each agent $i$ casts a
public vote $V_i(t) \in \{0, 1\}$, depending on which of the two is more likely to be the state of the world,
conditioned on the information available to $i$; this includes $A_i$ as well as the votes of the other
agents in the previous rounds.



Definition 1.3. For $t \in \{1, 2, \ldots\}$ and agent $i \in [n]$, let $V_i(t)$, the vote of agent $i$ at time $t$, be
defined by

\[ V_i(t) = \begin{cases} 1 & \text{if } \mathbb{P}\left[S = 1 \mid A_i, V^{t-1}\right] > 1/2 \\ 0 & \text{otherwise} \end{cases} \tag{2} \]

where $V^t = \{V_j(t') : j \in [n],\ t' \le t\}$ denotes the votes of all agents up to time $t$.

Alternatively, one could define

\[ V_i(t) = \operatorname*{argmax}_{s \in \{0,1\}} \mathbb{P}\left[S = s \mid A_i, V^{t-1}\right], \tag{3} \]

with a "tie breaking law" specifying that when the conditional probability on the r.h.s. is half
then the vote is 0. We note that it is easy to see that the assumption that the distribution of the
Radon-Nikodym derivatives $\frac{d\mu_1}{d\mu_0}(A_i)$ is non-atomic (Definition 1.1) implies that the probability of
ever encountering a tie is zero, and therefore the details of the tie breaking rule a.s. do not affect
the behavior of the process or our results.

1.3 Results 

For the model defined in Definitions 1.1, 1.2 and 1.3 we prove the following theorems.

• Unanimity: A unanimous decision is always reached. That is, with probability one all agents 
vote identically at some round, and the process essentially ends. 

Theorem 1.4. With probability one there exists a time $t_u$ and a vote $V \in \{0, 1\}$ such that
for all $t \ge t_u$ and agents $i$ it holds that $V_i(t) = V$.

• Monotonicity: The probability that an agent votes correctly is non-decreasing with the 
progression of rounds. 

Theorem 1.5. For all agents $i$ and times $t > 1$, it holds that

\[ \mathbb{P}\left[V_i(t) = S\right] \ge \mathbb{P}\left[V_i(t-1) = S\right]. \]

• Asymptotic Learning: The probability of reaching a correct decision at the end of the 
process approaches one as the number of agents increases. In fact, this already holds by the 
second round of voting. 

Theorem 1.6. Fix $\mu_0$ and $\mu_1$, and let $n$ be the number of agents. Then there exist constants
$C = C(\mu_0, \mu_1)$ and $n_0 = n_0(\mu_0, \mu_1)$ such that

\[ \mathbb{P}\left[\forall i : V_i(2) = S\right] \ge 1 - e^{-Cn} \]

for all $n > n_0$.

• Tractability: In order to discuss tractability we must assume that certain calculations related
to the distributions $\mu_1$ and $\mu_0$ take constant time, or alternatively that the algorithm has
access to an oracle which performs them in constant time. Specifically, we define below
(Definition 2.1) the log-likelihood ratio $x = \log\frac{d\mu_1}{d\mu_0}$ and its conditional distributions $\nu_0$ and
$\nu_1$, and assume that their cumulative distribution functions can be calculated in constant
time. Then we show that the agents' computations are tractable:

Theorem 1.7. Fix $\mu_0$ and $\mu_1$, and let $n$ be the number of agents. Assume that $x$, as well as
the cumulative distribution functions of $\nu_0$ and $\nu_1$, can be calculated in constant time. Then
there exists an algorithm with running time $O(nt)$, which, given $i$'s private signal $A_i$ and the
votes $V^{t-1} = \{V_j(t') : j \in [n],\ t' < t\}$, calculates $V_i(t)$, agent $i$'s vote at time $t$.

We in fact provide a simple algorithm that performs this calculation.

1.4 Comparison to majority voting 

Apart from being computationally easier, majority voting is inferior in every one of the above 
senses. In particular, it doesn't aggregate information as well as consensus voting. Furthermore, it 
may not have the asymptotic learning property. Consider the following example: A committee 
has to decide whether or not to accept a candidate for a position. Each member of the committee 
interviews the candidate and forms an opinion. Now, assume that a good candidate will make a
favorable impression nine times out of ten, whereas a bad candidate will make a favorable impression
six times out of ten (being good enough to have made it to the interview stage). In this case, with
overwhelming probability (i.e., with probability that tends to one as the size of the committee 
increases), when the candidate is bad, about sixty percent of the committee members will still 
have a good impression, and consequently a decision by majority will come to the wrong decision, 
namely that the candidate is good. 

This flaw is rectified by a second vote: after seeing the results of the first round of voting, the 
committee members will realize that too few of them had a good impression, and will vote against 
the bad candidate in the second round. Indeed, we prove below that asymptotic learning is always 
achieved after two voting rounds. This suggests that in situations where voting until convergence 
is impractical, it may be beneficial to have more than one round of voting. 
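
The arithmetic behind this example can be checked directly. The sketch below is ours, not the paper's: it assumes that every first-round vote simply reports the member's impression, so that the tally of favorable votes is known to all before the second round, and it ignores the tie-breaking issues that arise because binary impressions are atomic (so this toy example does not satisfy Definition 1.1).

```python
import math

P_FAVORABLE_GOOD = 0.9  # chance that a good candidate makes a favorable impression
P_FAVORABLE_BAD = 0.6   # chance that a bad candidate makes a favorable impression

def majority_says_hire(favorable, n):
    """Single-round majority rule on the raw impressions."""
    return favorable > n / 2

def count_log_likelihood_ratio(favorable, n):
    """log P[tally | good] / P[tally | bad]; the binomial coefficients cancel."""
    unfavorable = n - favorable
    return (favorable * math.log(P_FAVORABLE_GOOD / P_FAVORABLE_BAD)
            + unfavorable * math.log((1 - P_FAVORABLE_GOOD) / (1 - P_FAVORABLE_BAD)))

def second_round_says_hire(favorable, n):
    """Bayesian decision from the first-round tally, under a uniform prior."""
    return count_log_likelihood_ratio(favorable, n) > 0
```

With 100 members and the typical tally for a bad candidate (60 favorable), the majority rule says "hire" while the log-likelihood ratio of the tally is strongly negative, so a second Bayesian vote rejects the candidate; with the typical tally for a good candidate (90 favorable) both rules say "hire".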

Another characteristic advantage of consensus voting is that the strength of the participants'
convictions counts. Consider a situation in which each agent's private signal is, with high
probability, independent of the state of the world, but with some probability provides very convincing
evidence. While a single agent possessing the said "smoking gun" would have little impact in a 
majority vote, his or her insistence in subsequent rounds would convey the weight of the evidence 
to the rest of the group. 

1.5 Related work 

In 1785 the Marquis de Condorcet proved a founding result [4] in the field of group decision making.
The Condorcet Jury Theorem states that given that each member of a jury knows the correct verdict
with some probability $p > 1/2$, the probability that the jury reaches a correct decision by a majority
vote goes to one as the size of the jury increases. Our "asymptotic learning" result extends this
theorem to a more general class of private signals, given that at least two rounds of voting are
carried out.

Sebenius and Geanakoplos [15] show in a classical paper that a pair of agents eventually reach
agreement on the "state of the world" in a model similar to ours, with finite probability spaces.
A consequence of the convergence proof given by Gale and Kariv [6] as well as Rosenberg, Solan
and Vieille [14] (for models which are generalizations of ours) is that if a pair of agents' actions
converge, it is to the same value. We provide a basic proof of a stronger result, namely that all
the voters converge, and to the same vote. We further show that each round of voting increases
the probability that a given agent votes for the better alternative, and that this probability goes to
one at the second round of voting, as the number of participants goes to infinity. Finally, our most
significant improvement over the work of Gale and Kariv [6] is that we provide the participants
with a simple and efficient algorithm to calculate their votes.

In a work subsequent to this paper, together with Allan Sly [12], we show that for very general
voting models, including the Gale and Kariv model, asymptotic learning holds in the sense that as
the number of voters goes to infinity the probability of convergence to the correct outcome goes to
1. For the general models studied in [12] no efficient update algorithms are known and no rates of
convergence (both in terms of the size of the graph and in terms of the number of iterations) are
currently known.

2 Proofs 

Before proving our theorems we make some additional definitions. We start by defining the
log-likelihood ratio $x$ and its conditional distributions $\nu_0$ and $\nu_1$.

Definition 2.1. Let $x : \Omega \to \mathbb{R}$ be given by

\[ x(\omega) = \log\frac{d\mu_1}{d\mu_0}(\omega). \tag{4} \]

Let $\nu_0$ be the distribution of $x(A)$ when $A \sim \mu_0$ and let $\nu_1$ be the distribution of $x(A)$ when
$A \sim \mu_1$.

Note that if $(S, A) \sim \frac{1}{2}\delta_0 \otimes \mu_0 + \frac{1}{2}\delta_1 \otimes \mu_1$ and $X = x(A)$ then

\[ x(A) = \log\frac{\mathbb{P}\left[A \mid S = 1\right]}{\mathbb{P}\left[A \mid S = 0\right]}, \]

i.e., $x(A)$ is the log-likelihood ratio of $S$ given $A$.

In the proofs that follow we denote

\[ X_i = x(A_i), \]

agent $i$'s private log-likelihood ratio.

In our analysis below an event that we often encounter is $a < X_i < b$, and hence the following
definition will be useful.

Definition 2.2. Let $x : \mathbb{R}^2 \to \mathbb{R}$ be given by

\[ x(a, b) = \log\frac{\nu_1(a < X < b)}{\nu_0(a < X < b)}. \tag{5} \]

Note that if $(S, A) \sim \frac{1}{2}\delta_0 \otimes \mu_0 + \frac{1}{2}\delta_1 \otimes \mu_1$ and $X = x(A)$ then

\[ x(a, b) = \log\frac{\mathbb{P}\left[a < X < b \mid S = 1\right]}{\mathbb{P}\left[a < X < b \mid S = 0\right]}, \]

i.e., $x(a, b)$ is the log-likelihood ratio of $S$ given that $X$ is between $a$ and $b$.

The following claim follows by application of Bayes' law to Eq. (5). It follows from this claim
that if the cumulative distribution functions of $\nu_0$ and $\nu_1$ can be calculated in constant time then
so can $x(\cdot, \cdot)$.

Claim 2.3.

\[ x(a, b) = \log\frac{\nu_1(X < b) - \nu_1(X < a)}{\nu_0(X < b) - \nu_0(X < a)}. \tag{6} \]
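
To make Claim 2.3 concrete, here is a sketch for the committee example of the introduction, which we use purely as an illustration. There the private log-likelihood ratio is $X = 2A$, so $\nu_1 = N(2, 4)$ and $\nu_0 = N(-2, 4)$, and both cumulative distribution functions are available through the error function. The helper names are ours.

```python
import math

def normal_cdf(x, mean, std):
    """CDF of a normal distribution; math.erf maps -inf and +inf to -1.0 and 1.0."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def interval_llr(a, b):
    """x(a, b) of Eq. (6) for nu_1 = N(2, 4) and nu_0 = N(-2, 4) (standard deviation 2)."""
    mass1 = normal_cdf(b, 2.0, 2.0) - normal_cdf(a, 2.0, 2.0)
    mass0 = normal_cdf(b, -2.0, 2.0) - normal_cdf(a, -2.0, 2.0)
    return math.log(mass1 / mass0)
```

Each call costs a constant number of CDF evaluations, which is the assumption behind the tractability result. Subtracting CDF values loses precision for intervals far out in the tails, so a careful implementation would use survival functions or log-space arithmetic there.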

We shall need the following easy claim in some of the proofs below.

Claim 2.4. Let $\mu_0, \mu_1$ be such that $d\mu_1/d\mu_0$ exists and is non-zero for all $\omega$, and let

\[ x(\omega) = \log\frac{d\mu_1}{d\mu_0}(\omega). \]

Let $\nu_0$ be the distribution of $x = x(A)$ when $A \sim \mu_0$, and let $\nu_1$ be the distribution of $x = x(A)$
when $A \sim \mu_1$. Then $d\nu_1/d\nu_0$ exists and

\[ x = \log\frac{d\nu_1}{d\nu_0}(x). \tag{7} \]

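Proof. Since the Radon-Nikodym derivative $d\mu_1/d\mu_0$ exists and is non-zero, then $\mu_1$ and $\mu_0$ are
absolutely continuous with respect to each other, by the Radon-Nikodym theorem [13]. Since $x(A)$
is a function of $A$ then it follows that $\nu_1$ and $\nu_0$ are also absolutely continuous with respect to each
other, and so $d\nu_1/d\nu_0$ exists and is non-zero.

Let $(S, A) \sim \frac{1}{2}\delta_0 \otimes \mu_0 + \frac{1}{2}\delta_1 \otimes \mu_1$, and denote $X = x(A)$. Let

\[ M = \mathbb{P}\left[S = 1 \mid A\right] = \mathbb{E}\left[S \mid A\right]. \]

Then $X = \log(M/(1 - M))$. By the law of total expectation we have that

\[ \mathbb{E}\left[S \mid M\right] = \mathbb{E}\left[\mathbb{E}\left[S \mid M, A\right] \mid M\right]. \]

Since $M$ is a function of $A$ then $\mathbb{E}\left[S \mid M, A\right] = \mathbb{E}\left[S \mid A\right] = M$ and it follows that

\[ \mathbb{E}\left[S \mid M\right] = \mathbb{E}\left[M \mid M\right] = M. \]

Now, since $X = \log(M/(1 - M))$, there is a one-to-one
correspondence between $X$ and $M$. Hence $\mathbb{E}\left[S \mid X\right] = \mathbb{E}\left[S \mid M\right] = M$. Therefore

\[ \log\frac{\mathbb{P}\left[S = 1 \mid X\right]}{\mathbb{P}\left[S = 0 \mid X\right]} = \log\frac{M}{1 - M} = X. \]

But the l.h.s. of this equation is by Bayes' law equal to

\[ \log\frac{\mathbb{P}\left[X \mid S = 1\right]}{\mathbb{P}\left[X \mid S = 0\right]}, \]

which is equal to $\log\frac{d\nu_1}{d\nu_0}(X)$, and thus we have that

\[ X = \log\frac{d\nu_1}{d\nu_0}(X). \]

□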



2.1 Tractability 

The key observation behind our tractability proof is the following. Let $i$ and $j$ be two agents. Since
$j$ knows all that $i$ knows except $A_i$, then even before $i$ votes, $j$ can know for which values of $A_i$ $i$
would vote 1, and for which it would vote 0. In fact, we show below that there is a bound (that $j$
can calculate) such that if $X_i$ is below that bound then $i$ will vote 0 and otherwise $i$ will vote 1.

Thus at each voting round, $j$ gains either a lower bound or an upper bound on $X_i$. What we in
fact show is that calculating a lower bound $A_i(t)$ and an upper bound $B_i(t)$ on all other agents'
private log-likelihood ratios $X_i$ is almost all that $j$ needs to do to calculate its votes.

The following is the definition of these lower bounds $A_i(t)$ and upper bounds $B_i(t)$.

Definition 2.5. Let $v_i(t) \in \{0, 1\}$ for $i \in [n]$ and $t \in \mathbb{N}$. Similarly to Definition 1.3, let
$v^t = \{v_i(t') : i \in [n],\ t' \le t\}$ denote an element of $\{0, 1\}^{nt}$. For $i \in [n]$ and $t \ge 0$, let $a_i : \{0, 1\}^{nt} \to \mathbb{R}$
and $b_i : \{0, 1\}^{nt} \to \mathbb{R}$ be the functions recursively defined by

\[ a_i(v^t) = \max_{t' \le t \text{ s.t. } v_i(t') = 1} \Big\{ -\sum_{j \neq i} x\big(a_j(v^{t'-1}), b_j(v^{t'-1})\big) \Big\} \tag{8} \]

and

\[ b_i(v^t) = \min_{t' \le t \text{ s.t. } v_i(t') = 0} \Big\{ -\sum_{j \neq i} x\big(a_j(v^{t'-1}), b_j(v^{t'-1})\big) \Big\}, \tag{9} \]

where the minimum (resp., maximum) over the empty set is taken to be infinity (resp., minus
infinity).

Let $A_i(t)$ and $B_i(t)$ be the random variables defined by

\[ A_i(t) = a_i(V^t) \]

and

\[ B_i(t) = b_i(V^t). \]

Note that $A_i(t)$ is non-decreasing and $B_i(t)$ is non-increasing, in $t$.

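Since, as we show below, $A_i(t)$ and $B_i(t)$ are lower and upper bounds on $X_i$ at time $t$, we shall
often need to refer to $x(A_i(t), B_i(t))$, and hence denote

\[ X_i(t) = x(A_i(t), B_i(t)) = x\big(a_i(V^t), b_i(V^t)\big). \tag{10} \]

Recall that $V_i(t)$, agent $i$'s vote at time $t$, depends on whether or not $\mathbb{P}\left[S = 1 \mid A_i, V^{t-1}\right]$ is
greater than half:

\[ V_i(t) = \begin{cases} 1 & \text{if } \mathbb{P}\left[S = 1 \mid A_i, V^{t-1}\right] > 1/2 \\ 0 & \text{otherwise.} \end{cases} \]

Let

\[ Y_i(t) = \log\frac{\mathbb{P}\left[S = 1 \mid A_i, V^{t-1}\right]}{\mathbb{P}\left[S = 0 \mid A_i, V^{t-1}\right]}. \tag{11} \]

Then

\[ V_i(t) = 1 \quad\text{iff}\quad Y_i(t) > 0. \tag{12} \]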



Theorem 2.6. For all $i \in [n]$ and $t > 0$ it holds that

\[ Y_i(t) = X_i + \sum_{j \neq i} X_j(t-1). \tag{13} \]

Proof. We prove by induction on $t$. The basis $t = 1$ follows simply from the definitions; since $V^0$ is
empty (the agents only start voting at $t = 1$) then $A_i(0) = -\infty$ and $B_i(0) = \infty$ for all $i \in [n]$, and
so

\[ X_i(0) = x(A_i(0), B_i(0)) = x(-\infty, \infty) = 0, \]

by the definition of $x(\cdot, \cdot)$ (Eq. (5)). Another consequence of the fact that $V^0$ is empty is that

\[ Y_i(1) = \log\frac{\mathbb{P}\left[S = 1 \mid A_i\right]}{\mathbb{P}\left[S = 0 \mid A_i\right]} \]

and so for $t = 1$ the statement of the theorem (Eq. (13)) reduces to

\[ \log\frac{\mathbb{P}\left[S = 1 \mid A_i\right]}{\mathbb{P}\left[S = 0 \mid A_i\right]} = X_i, \]

which is precisely the definition of $X_i = x(A_i)$.

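Assume the statement holds for all $t' < t$ and all $i \in [n]$. We will show that it holds for $t$ and
all $i$. Since, as we note above, $V_i(t') = 1$ iff $Y_i(t') > 0$, then by the inductive assumption we have
that

\[ V_i(t') = 1 \quad\text{iff}\quad X_i + \sum_{j \neq i} X_j(t'-1) > 0 \tag{14} \]

or

\[ V_i(t') = \mathbf{1}\Big\{ -\sum_{j \neq i} X_j(t'-1) < X_i \Big\} = \mathbf{1}\Big\{ -\sum_{j \neq i} x\big(a_j(V^{t'-1}), b_j(V^{t'-1})\big) < X_i \Big\}, \]

where the second equality follows by substituting the definition of $X_j(t'-1)$ (Eq. (10)). Hence $V_i(t')$ is equivalent
to either a lower bound (if it is equal to 1) or an upper bound (if it is equal to 0) on $X_i$.
Therefore the event $V^{t-1} = v^{t-1}$ is equal to the event that

\[ X_i > -\sum_{j \neq i} x\big(a_j(v^{t'-1}), b_j(v^{t'-1})\big) \]

for all $i$ and $t' < t$ such that $v_i(t') = 1$ and

\[ X_i < -\sum_{j \neq i} x\big(a_j(v^{t'-1}), b_j(v^{t'-1})\big) \]

for all $i$ and $t' < t$ such that $v_i(t') = 0$. Equivalently, for all $i \in [n]$:

\[ \max_{t' < t \text{ s.t. } v_i(t') = 1} \Big\{ -\sum_{j \neq i} x\big(a_j(v^{t'-1}), b_j(v^{t'-1})\big) \Big\} < X_i < \min_{t' < t \text{ s.t. } v_i(t') = 0} \Big\{ -\sum_{j \neq i} x\big(a_j(v^{t'-1}), b_j(v^{t'-1})\big) \Big\}. \]

Substituting the definitions of $a_i$ (Eq. (8)) and $b_i$ (Eq. (9)), this event is equal to the event

\[ a_i(v^{t-1}) < X_i < b_i(v^{t-1}) \tag{15} \]

for all $i \in [n]$ (note that this means that $A_i(t') < X_i < B_i(t')$ for all $i$ and $t'$). Therefore

\[ \mathbb{P}\left[S = s \mid V^{t-1} = v^{t-1}\right] = \mathbb{P}\left[S = s \mid a_i(v^{t-1}) < X_i < b_i(v^{t-1}) \text{ for } i \in [n]\right] \]

and also

\[ \mathbb{P}\left[S = s \mid A_i = \omega, V^{t-1} = v^{t-1}\right] = \mathbb{P}\left[S = s \mid A_i = \omega,\ a_j(v^{t-1}) < X_j < b_j(v^{t-1}) \text{ for } j \neq i\right]. \]

Hence

\[ \log\frac{\mathbb{P}\left[S = 1 \mid A_i = \omega, V^{t-1} = v^{t-1}\right]}{\mathbb{P}\left[S = 0 \mid A_i = \omega, V^{t-1} = v^{t-1}\right]} = \log\frac{\mathbb{P}\left[S = 1 \mid A_i = \omega,\ a_j(v^{t-1}) < X_j < b_j(v^{t-1}) \text{ for } j \neq i\right]}{\mathbb{P}\left[S = 0 \mid A_i = \omega,\ a_j(v^{t-1}) < X_j < b_j(v^{t-1}) \text{ for } j \neq i\right]}. \]

Again invoking Bayes' law, we have that

\[ \log\frac{\mathbb{P}\left[S = 1 \mid A_i = \omega, V^{t-1} = v^{t-1}\right]}{\mathbb{P}\left[S = 0 \mid A_i = \omega, V^{t-1} = v^{t-1}\right]} = \log\frac{\mathbb{P}\left[A_i = \omega \mid S = 1\right] \cdot \prod_{j \neq i} \mathbb{P}\left[a_j(v^{t-1}) < X_j < b_j(v^{t-1}) \mid S = 1\right]}{\mathbb{P}\left[A_i = \omega \mid S = 0\right] \cdot \prod_{j \neq i} \mathbb{P}\left[a_j(v^{t-1}) < X_j < b_j(v^{t-1}) \mid S = 0\right]}, \]

since the private signals are independent, conditioned on $S$. Substituting the definition of $x(\omega)$
(Eq. (4)) and the definition of $x(\cdot, \cdot)$ (Eq. (5)) yields

\[ \log\frac{\mathbb{P}\left[S = 1 \mid A_i = \omega, V^{t-1} = v^{t-1}\right]}{\mathbb{P}\left[S = 0 \mid A_i = \omega, V^{t-1} = v^{t-1}\right]} = x(\omega) + \sum_{j \neq i} x\big(a_j(v^{t-1}), b_j(v^{t-1})\big). \tag{16} \]

Finally, since

\[ Y_i(t) = \log\frac{\mathbb{P}\left[S = 1 \mid A_i, V^{t-1}\right]}{\mathbb{P}\left[S = 0 \mid A_i, V^{t-1}\right]} \]

then by Eq. (16)

\[ Y_i(t) = x(A_i) + \sum_{j \neq i} x\big(a_j(V^{t-1}), b_j(V^{t-1})\big), \]

and the theorem follows by substituting $X_i = x(A_i)$ and $X_j(t-1) = x\big(a_j(V^{t-1}), b_j(V^{t-1})\big)$. □

We are now ready to prove our main theorem for this subsection.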



Theorem 1.7. Fix $\mu_0$ and $\mu_1$, and let $n$ be the number of agents. Assume that $x$, as well
as the cumulative distribution functions of $\nu_0$ and $\nu_1$, can be calculated in constant time. Then
there exists an algorithm with running time $O(nt)$, which, given $i$'s private signal $A_i$ and the votes
$V^{t-1} = \{V_j(t') : j \in [n],\ t' < t\}$, calculates $V_i(t)$, agent $i$'s vote at time $t$.






Proof. By Eq. (12) we have that $V_i(t)$ is a simple function of $Y_i(t)$. By Theorem 2.6 above, $Y_i(t)$
can be calculated in $O(n)$ by adding $X_i$ (which we assume can be calculated in constant time given
$A_i$) to the sum over $j \neq i$ of $x(A_j(t-1), B_j(t-1))$. By Eq. (6), $x(a, b)$ can be calculated in constant time,
assuming the cumulative distribution functions of $\nu_0$ and $\nu_1$ can be calculated in constant time.

We have therefore reduced the problem to that of calculating $A_j(t-1) = a_j(V^{t-1})$ and $B_j(t-1) = b_j(V^{t-1})$.
However, the definitions of $a_j$ and $b_j$ (Eqs. (8) and (9)) are in fact simple recursive rules for
calculating $a_j(v^t)$ and $b_j(v^t)$ for all $j \in [n]$, given $a_j(v^{t-1})$ and $b_j(v^{t-1})$ for all $j \in [n]$: it follows
directly from Eqs. (8) and (9) that

\[ a_j(v^t) = \begin{cases} \max\Big\{ a_j(v^{t-1}),\ -\sum_{k \neq j} x\big(a_k(v^{t-1}), b_k(v^{t-1})\big) \Big\} & \text{if } v_j(t) = 1 \\ a_j(v^{t-1}) & \text{otherwise} \end{cases} \]

with an analogous equation for $b_j(v^t)$.

Note that the sum $\sum_{k \neq j} x\big(a_k(v^{t-1}), b_k(v^{t-1})\big)$ needn't be calculated from scratch for each $j$; one
can rather sum over all $k \in [n]$ once and subtract the appropriate term for each $j$. Hence calculating
$a_j(v^t)$ and $b_j(v^t)$ (for all $j$) given their predecessors takes $O(n)$, and the entire recursive calculation
takes $O(nt)$. □
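
The recursion in this proof translates almost line by line into code. The sketch below is an illustration under assumptions of ours: it hard-codes the committee example of the introduction, where $\nu_1 = N(2, 4)$ and $\nu_0 = N(-2, 4)$, and all function names are hypothetical. Any other model can be plugged in by replacing `interval_llr` with Eq. (6) for its own distribution functions.

```python
import math

INF = float("inf")

def normal_cdf(x, mean, std):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def interval_llr(a, b):
    """x(a, b) of Eq. (6), assuming nu_1 = N(2, 4) and nu_0 = N(-2, 4)."""
    mass1 = normal_cdf(b, 2.0, 2.0) - normal_cdf(a, 2.0, 2.0)
    mass0 = normal_cdf(b, -2.0, 2.0) - normal_cdf(a, -2.0, 2.0)
    return math.log(mass1 / mass0)

def update_bounds(lower, upper, votes):
    """Bounds after round t from the bounds after round t-1 and the round-t votes; O(n)."""
    terms = [interval_llr(a, b) for a, b in zip(lower, upper)]
    total = sum(terms)  # summed once; one term is subtracted per agent below
    new_lower, new_upper = list(lower), list(upper)
    for j, vote in enumerate(votes):
        threshold = -(total - terms[j])  # minus the sum over k != j
        if vote == 1:
            new_lower[j] = max(lower[j], threshold)
        else:
            new_upper[j] = min(upper[j], threshold)
    return new_lower, new_upper

def compute_vote(i, private_llr, history, n):
    """V_i(t) from X_i = private_llr and the t-1 earlier rounds in history; O(n t)."""
    lower, upper = [-INF] * n, [INF] * n
    for votes in history:
        lower, upper = update_bounds(lower, upper, votes)
    terms = [interval_llr(a, b) for a, b in zip(lower, upper)]
    y = private_llr + sum(terms) - terms[i]  # Eq. (13)
    return 1 if y > 0 else 0
```

For instance, with three agents whose private log-likelihood ratios are $-0.2$, $-0.3$ and $5.0$, the first two rounds of votes are $(0, 0, 1)$, and in the third round all three agents vote 1. A vote history that no signal profile could have produced may yield an empty interval, in which case `interval_llr` raises an error; the sketch does not guard against that.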

2.2 Unanimity 

Our model is a special case of that of Gale and Kariv [6], and as such, by a proof they provide, if 
two voters converge then they converge to the same vote. However, their paper leaves untreated 
the possibility that the voters don't converge. We prove that consensus is reached with probability 
one. 

Before proving the theorem we prove some standard claims. Recall the definition of $x(\cdot, \cdot)$
(Eq. (5)):

\[ x(a, b) = \log\frac{\mathbb{P}\left[S = 1 \mid a < X < b\right]}{\mathbb{P}\left[S = 0 \mid a < X < b\right]}. \]

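Claim 2.7. Let $a, b$ be such that $x(a, b)$ is well defined (i.e., $\mathbb{P}\left[a < X < b \mid S = 0\right] > 0$). Then

\[ x(a, b) = \log \mathbb{E}\left[e^X \mid a < X < b,\ S = 0\right]. \tag{17} \]

Proof. By Bayes' law we have that

\[ x(a, b) = \log\frac{\mathbb{P}\left[a < X < b \mid S = 1\right]}{\mathbb{P}\left[a < X < b \mid S = 0\right]}. \]

Substituting the conditional distributions of $X$ yields

\[ x(a, b) = \log\frac{\int_a^b d\nu_1(x)}{\int_a^b d\nu_0(x)}. \]

By Claim 2.4:

\[ x = \log\frac{d\nu_1}{d\nu_0}(x), \tag{18} \]

and so we have that

\[ x(a, b) = \log\frac{\int_a^b \frac{d\nu_1}{d\nu_0}(x)\, d\nu_0(x)}{\int_a^b d\nu_0(x)} = \log\frac{\int_a^b e^x\, d\nu_0(x)}{\int_a^b d\nu_0(x)}. \]

Recalling that $\nu_0$ is the distribution of $X$ conditioned on $S = 0$ we have that

\[ x(a, b) = \log \mathbb{E}\left[e^X \mid a < X < b,\ S = 0\right]. \]

□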
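
As a sanity check of Eq. (17), one can compare the two expressions for $x(a, b)$ numerically. The sketch below again assumes the committee example of the introduction ($\nu_1 = N(2, 4)$, $\nu_0 = N(-2, 4)$), uses a plain midpoint Riemann sum for the conditional expectation, and only handles finite $a$ and $b$; the function names are ours.

```python
import math

def normal_cdf(x, mean, std):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def density_nu0(x):
    """Density of X given S = 0 in the assumed example: N(-2, 4)."""
    return math.exp(-((x + 2.0) ** 2) / 8.0) / math.sqrt(8.0 * math.pi)

def interval_llr_from_cdfs(a, b):
    """x(a, b) as a log-ratio of interval masses under nu_1 and nu_0."""
    mass1 = normal_cdf(b, 2.0, 2.0) - normal_cdf(a, 2.0, 2.0)
    mass0 = normal_cdf(b, -2.0, 2.0) - normal_cdf(a, -2.0, 2.0)
    return math.log(mass1 / mass0)

def interval_llr_from_expectation(a, b, steps=20000):
    """log E[e^X | a < X < b, S = 0], approximated by a midpoint Riemann sum."""
    width = (b - a) / steps
    numerator = 0.0
    denominator = 0.0
    for k in range(steps):
        x = a + (k + 0.5) * width
        weight = density_nu0(x)
        numerator += math.exp(x) * weight
        denominator += weight
    return math.log(numerator / denominator)
```

The two functions agree up to the discretization error of the Riemann sum, and the common value lies strictly between $a$ and $b$, as a conditional average of $X$-dependent quantities over the interval should.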



Recall that we assume that the distribution of $X$ is non-atomic (Definition 1.1). Hence the
following claim is a consequence of Eq. (17) above, by a standard argument that we omit.

Claim 2.8. $x(a, b)$ is non-decreasing and continuous in $a$ and in $b$.

The following claim follows directly from Eq. (17) above.

Claim 2.9. Let $a, b$ be such that $x(a, b)$ is well defined (i.e., $\mathbb{P}\left[a < X < b \mid S = 0\right] > 0$). Then
$a < x(a, b) < b$, assuming the distribution of $X$ is non-atomic.

Proof. By Eq. (17) we have that

\[ e^{x(a,b)} = \mathbb{E}\left[e^X \mid e^a < e^X < e^b,\ S = 0\right] \]

and so $e^a \le e^{x(a,b)} \le e^b$. Since we assume the distribution of $X$ is non-atomic (Definition 1.1) then

\[ e^a < \mathbb{E}\left[e^X \mid e^a < e^X < e^b,\ S = 0\right] < e^b \]

and so $e^a < e^{x(a,b)} < e^b$ and the claim follows.

□



We now show a condition for unanimity. We will later prove that unanimity occurs w.p. 1 by 
showing that this condition eventually applies, w.p. 1. 



Lemma 2.10. If

\[ \sum_i \big(B_i(t) - A_i(t)\big) < \Big|\sum_i X_i\Big| \tag{19} \]

then there exists a $V$ such that $V_i(t') = V$ for all $i$ and $t' > t$. I.e., unanimity is reached at time $t$.

Proof. We first note that, since $A_i(t)$ is non-decreasing and $B_i(t)$ is non-increasing, if Eq. (19)
holds at time $t$ then it also holds at all times $t' > t$.

Now, recall that $X_i(t) = x(A_i(t), B_i(t))$. By Claim 2.9 we have that $A_i(t) < X_i(t) < B_i(t)$.
From Eq. (15) it follows that the same holds for $X_i$ too: $A_i(t) < X_i < B_i(t)$. Hence $|X_i(t) - X_i| <
B_i(t) - A_i(t)$ and for all $i \in [n]$ we have that

\[ \Big|\sum_{j \neq i} \big(X_j(t) - X_j\big)\Big| < \sum_j \big(B_j(t) - A_j(t)\big). \tag{20} \]

Recall that

\[ Y_i(t+1) = X_i + \sum_{j \neq i} X_j(t), \]

and so

\[ Y_i(t+1) - \sum_j X_j = \sum_{j \neq i} \big(X_j(t) - X_j\big). \]

Therefore by Eq. (20) we have that

\[ \Big|Y_i(t+1) - \sum_j X_j\Big| < \sum_j \big(B_j(t) - A_j(t)\big). \]

By the hypothesis of the lemma this implies that

\[ \Big|Y_i(t+1) - \sum_j X_j\Big| < \Big|\sum_j X_j\Big|. \]

Hence $Y_i(t+1)$ and $\sum_j X_j$ have the same sign. Since $V_i(t+1) = 1$ iff $Y_i(t+1) > 0$ (Eq. (12)),
we have shown that at time $t + 1$ all agents vote identically. Since if Eq. (19) holds for time $t$
then it also holds for time $t + 1$, we have shown that for all $t' > t$ the agents will agree in every
round. It remains to show that they don't all change their opinion, as a group.

Now, if the agents all vote 1 at time $t + 1$ then, by the definition of $A_i(t)$ and $B_i(t)$, it holds
that $B_i(t+2) = B_i(t+1)$ and $A_i(t+2) \ge A_i(t+1)$. Since by Claim 2.8 $x(a, b)$ is non-decreasing
in $a$, we have that $X_i(t+2) \ge X_i(t+1)$ for all $i$, and so $Y_i(t+2) \ge Y_i(t+1)$ for all $i$. Hence
the agents will all vote 1 at time $t + 2$. The same argument applies when all the agents vote 0 at
time $t + 1$, and the proof follows by induction on $t$. □

We make another definition before proceeding to prove the main theorem of this subsection.
Recall the definitions of $a_i$, $b_i$, $A_i$ and $B_i$:

\[ a_i(v^t) = \max_{t' \le t \text{ s.t. } v_i(t') = 1} \Big\{ -\sum_{j \neq i} x\big(a_j(v^{t'-1}), b_j(v^{t'-1})\big) \Big\} \]

and

\[ b_i(v^t) = \min_{t' \le t \text{ s.t. } v_i(t') = 0} \Big\{ -\sum_{j \neq i} x\big(a_j(v^{t'-1}), b_j(v^{t'-1})\big) \Big\}, \]

with $A_i(t) = a_i(V^t)$ and $B_i(t) = b_i(V^t)$. As we noted above $A_i(t)$ is non-decreasing in $t$ and $B_i(t)$
is non-increasing in $t$. Hence they have limits which we denote by $A_i(\infty)$ and $B_i(\infty)$. Furthermore,
if as above we denote $X_i(t) = x\big(a_i(V^t), b_i(V^t)\big)$ then

\[ A_i(\infty) = \sup_{t' \text{ s.t. } V_i(t') = 1} \Big\{ -\sum_{j \neq i} X_j(t'-1) \Big\} \tag{21} \]

and

\[ B_i(\infty) = \inf_{t' \text{ s.t. } V_i(t') = 0} \Big\{ -\sum_{j \neq i} X_j(t'-1) \Big\}. \tag{22} \]

Note that since $X_i(t) = x(A_i(t), B_i(t))$, and since $x(a, b)$ is a continuous function of $a$ and $b$
(Claim 2.8) then

\[ \lim_{t \to \infty} X_i(t) = x\big(A_i(\infty), B_i(\infty)\big). \tag{23} \]

Theorem 1.4. With probability one there exists a time $t_u$ and a vote $V \in \{0, 1\}$ such that for all
$t \ge t_u$ and agents $i$ it holds that $V_i(t) = V$.

Proof. Assume by way of contradiction that unanimity is never reached, and so by Lemma 2.10 for
all $t$ it holds that $\sum_i \big(B_i(t) - A_i(t)\big) \ge \big|\sum_i X_i\big|$. Then, since $B_i(t) - A_i(t)$ is monotonically decreasing,
it holds that

\[ \lim_{t \to \infty} \sum_i \big(B_i(t) - A_i(t)\big) \ge \Big|\sum_i X_i\Big|. \tag{24} \]

We consider two cases:

1. $\lim_{t \to \infty} \sum_j X_j(t) = 0$.

We assume (Definition 1.1) that the distribution of $X_i$ is non-atomic, and so $\sum_i X_i \neq 0$ with
probability 1. Hence by Eq. (24) there must be some agent $i$ for which

\[ \lim_{t \to \infty} \big(B_i(t) - A_i(t)\big) = B_i(\infty) - A_i(\infty) > 0. \]

Assume w.l.o.g. that $V_i(t) = 1$ infinitely many times. Hence, by Eq. (21), we have that

\[ A_i(\infty) \ge -\lim_{t \to \infty} \sum_{j \neq i} X_j(t) = \lim_{t \to \infty} X_i(t) - \lim_{t \to \infty} \sum_j X_j(t). \]

Since we assume in this case that $\lim_{t \to \infty} \sum_j X_j(t) = 0$ then we have that

\[ A_i(\infty) \ge \lim_{t \to \infty} X_i(t). \]

But since $A_i(\infty) < B_i(\infty)$ then by Eq. (23) and Claim 2.9 we have that $A_i(\infty) < \lim_{t \to \infty} X_i(t)$,
which is a contradiction.

The intuition here is that when $i$ votes 1 it is revealed that $X_i > X_i(t) - \sum_j X_j(t)$. Hence if
$\sum_j X_j(t)$ is very small then $A_i(t)$ approaches $X_i(t)$ arbitrarily closely, which is impossible if
$A_i(t)$ is to stay well separated from $B_i(t)$.

2. $\lim_{t \to \infty} \sum_j X_j(t) \neq 0$.

Assume w.l.o.g. that $Z := \lim_{t \to \infty} \sum_j X_j(t) > 0$. Since unanimity is never reached then there
must be some $i$ for which $Y_i(t) \le 0$ infinitely many times. Hence by Eq. (22) we have that

\[ B_i(\infty) \le -\lim_{t \to \infty} \sum_{j \neq i} X_j(t) = \lim_{t \to \infty} X_i(t) - \lim_{t \to \infty} \sum_j X_j(t). \]

Since by assumption $\lim_{t \to \infty} \sum_j X_j(t) > 0$ then we have that

\[ B_i(\infty) < \lim_{t \to \infty} X_i(t). \]

However by Eq. (23) and Claim 2.9 we have that $B_i(\infty) \ge \lim_{t \to \infty} X_i(t)$, which is a
contradiction.

In this case the intuition is that when $i$ votes 0 even though $\sum_j X_j(t)$ is positive, then $B_i$
decreases by at least $\sum_j X_j(t)$, which cannot continue indefinitely when $\lim_{t \to \infty} \sum_j X_j(t) > 0$.

□

2.3 Monotonicity 

Since the agents base their decisions on a growing information base, their decisions become more 
and more likely to be correct. We prove this formally below, using a standard argument. 

Theorem 1.5. For all agents $i$ and times $t > 1$, it holds that 

$$\mathbb{P}[V_i(t) = S] \geq \mathbb{P}[V_i(t-1) = S].$$

Proof. As noted in Eq. (3), $V_i(t)$ is the choice in $\{0, 1\}$ that maximizes the probability of matching 
the state of the world, given $A_i$ and $V^{t-1}$. Let $f$ be an arbitrary function of $A_i$ and $V^{t-1}$. Then: 

$$\mathbb{P}[V_i(t) = S \mid A_i, V^{t-1}] \geq \mathbb{P}[f(A_i, V^{t-1}) = S \mid A_i, V^{t-1}].$$

Since $V_i(t-1)$ is also a function of $A_i$ and $V^{t-1}$, we can substitute $V_i(t-1)$ for $f(A_i, V^{t-1})$ in 
the inequality above, and the theorem follows by taking expectations of both sides. □ 

Note that $\mathbb{P}[V_i(t) = S]$ is strictly larger than $\mathbb{P}[V_i(t-1) = S]$ whenever $\mathbb{P}[V_i(t) \neq V_i(t-1)]$ is 
positive, i.e., when the decision may change. 
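
The key step of the proof, that the maximum-a-posteriori choice is at least as accurate as any other function of the same information, can be checked by brute force on a small finite example. The sketch below is illustrative only: the joint distribution `joint` and the three-valued observation are hypothetical stand-ins for $(A_i, V^{t-1})$ and are not taken from the model above.

```python
from itertools import product

# Hypothetical joint distribution P[S = s, O = o] over a binary state S and
# an observation O in {0, 1, 2}. The numbers are illustrative assumptions.
joint = {
    (0, 0): 0.30, (0, 1): 0.15, (0, 2): 0.05,
    (1, 0): 0.10, (1, 1): 0.15, (1, 2): 0.25,
}
observations = [0, 1, 2]


def accuracy(decision):
    """Return P[decision(O) = S] for a rule given as a dict o -> {0, 1}."""
    return sum(p for (s, o), p in joint.items() if decision[o] == s)


# MAP rule: for each observation, choose the state with the larger joint mass
# (ties are broken in favor of 0).
map_rule = {o: int(joint[(1, o)] > joint[(0, o)]) for o in observations}

# Every rule f(O) is one of the 2^3 maps from observations to {0, 1}.
all_rules = [dict(zip(observations, bits)) for bits in product([0, 1], repeat=3)]
best_any_rule = max(accuracy(f) for f in all_rules)

print("MAP accuracy:", accuracy(map_rule))
print("best accuracy over all rules:", best_any_rule)
```

No rule in the enumeration beats the MAP rule, which is the inequality used in the proof with $f = V_i(t-1)$.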

2.4 Asymptotic Learning 

We show that with high probability after observing the first round of voting all voters know the 
correct state of the world, and a unanimous and correct decision is reached at the second round of 
voting. Note that by the monotonicity theorem (Theorem 1.5), this means that the same holds for all rounds 
after the second round. 

Before proving this theorem we will prove the following claim. 

Claim 2.11. 

$$\mathbb{P}[X < -C \mid S = 1] < e^{-C}.$$
Proof. Recall that, by a previous claim, 

$$X = \log \frac{d\nu_1}{d\nu_0}(X),$$

and so 

$$\int_{-\infty}^{\infty} d\nu_0(X) = \int_{-\infty}^{\infty} e^{-X} \, d\nu_1(X).$$

But $\nu_0$ is a probability measure, and so $\int_{-\infty}^{\infty} d\nu_0(X) = 1$. Hence 

$$\int_{-\infty}^{\infty} e^{-X} \, d\nu_1(X) = 1.$$

Therefore by the Markov bound 

$$\mathbb{P}[X < -C \mid S = 1] = \mathbb{P}[e^{-X} > e^{C} \mid S = 1] < e^{-C}.$$

□ 
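
As a numerical sanity check of Claim 2.11, the sketch below works through one concrete signal model. The Gaussian model is an assumption made only for illustration and does not come from the text: the private signal $A$ is $N(+1, 1)$ when $S = 1$ and $N(-1, 1)$ when $S = 0$, so that the log-likelihood ratio is $X = 2A$. The helper names `normal_cdf`, `tail_prob` and `mean_exp_neg_x` are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed with the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def tail_prob(c):
    """Exact P[X < -c | S = 1] = P[A < -c/2] for A ~ N(1, 1) and X = 2A."""
    return normal_cdf(-c / 2.0 - 1.0)


def mean_exp_neg_x(steps=20_000, lo=-12.0, hi=14.0):
    """Midpoint-rule estimate of E[exp(-X) | S = 1] = E[exp(-2A)], A ~ N(1, 1)."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        a = lo + (k + 0.5) * h
        density = math.exp(-0.5 * (a - 1.0) ** 2) / math.sqrt(2.0 * math.pi)
        total += math.exp(-2.0 * a) * density * h
    return total


print("E[exp(-X) | S = 1] is approximately", mean_exp_neg_x())
for c in (0.0, 1.0, 2.0, 5.0):
    print(c, tail_prob(c), math.exp(-c))
```

In this assumed model the integral evaluates to 1 up to discretization error, and the exact tail probability stays below $e^{-C}$ for each value of $C$ tried.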

We are now ready to prove the main theorem of this subsection. 

Theorem 1.6. Fix $\mu_0$ and $\mu_1$, and let $n$ be the number of agents. Then there exist constants 
$C = C(\mu_0, \mu_1)$ and $n_0 = n_0(\mu_0, \mu_1)$ such that 

$$\mathbb{P}[\forall i : V_i(2) = S] > 1 - e^{-Cn}$$

for all $n > n_0$. 

Proof. We shall show that there exists a constant $C = C(\mu_0, \mu_1)$ such that 

$$\mathbb{P}[\forall i : V_i(2) = 1 \mid S = 1] > 1 - e^{-Cn}$$

for all $n$ large enough. Since the same argument can be used to show an analogous statement for 
$S = 0$, this will prove the theorem. 

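Recall that $V_i(t)$ is the indicator of the event $Y_i(t) > 0$, where 

$$Y_i(t) = \log \frac{\mathbb{P}[S = 1 \mid A_i, V^{t-1}]}{\mathbb{P}[S = 0 \mid A_i, V^{t-1}]}.$$

Invoking Bayes' law and the conditional independence of the private signals we get that 

$$Y_i(2) = \log \left( \frac{\mathbb{P}[A_i \mid S = 1]}{\mathbb{P}[A_i \mid S = 0]} \prod_{j \neq i} \frac{\mathbb{P}[V_j(1) \mid S = 1]}{\mathbb{P}[V_j(1) \mid S = 0]} \right).$$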

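Denote by $N_i$ the number of agents other than $i$ who vote 1 in the first round: 

$$N_i = |\{ j \text{ s.t. } V_j(1) = 1,\ j \neq i \}|.$$

Then since $X_i = \log \frac{\mathbb{P}[A_i \mid S = 1]}{\mathbb{P}[A_i \mid S = 0]}$, we can write 

$$Y_i(2) = X_i + N_i \log \frac{\mathbb{P}[X > 0 \mid S = 1]}{\mathbb{P}[X > 0 \mid S = 0]} + (n - 1 - N_i) \log \frac{\mathbb{P}[X \leq 0 \mid S = 1]}{\mathbb{P}[X \leq 0 \mid S = 0]}.$$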

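Denote 

$$\alpha_1 = \mathbb{P}[X > 0 \mid S = 1] \quad \text{and} \quad \alpha_0 = \mathbb{P}[X > 0 \mid S = 0],$$

and note that $x(0, \infty) > 0$ by Claim 2.9, and so, since $x(0, \infty) = \log(\alpha_1 / \alpha_0)$, we have $\alpha_1 \neq \alpha_0$. We can 
now write 

$$Y_i(2) = X_i + N_i \log \frac{\alpha_1}{\alpha_0} + (n - 1 - N_i) \log \frac{1 - \alpha_1}{1 - \alpha_0}.$$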
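
The closed form for $Y_i(2)$ translates directly into a few lines of code. The following is a minimal sketch, assuming that $\alpha_1$ and $\alpha_0$ are already known; the function names `second_round_llr` and `second_round_vote` and the numeric inputs in the usage lines are hypothetical.

```python
import math


def second_round_llr(x_i, others_votes, alpha1, alpha0):
    """Y_i(2) = X_i + N_i log(a1/a0) + (n - 1 - N_i) log((1 - a1)/(1 - a0)).

    x_i          -- agent i's private log-likelihood ratio X_i
    others_votes -- first-round votes (0 or 1) of the other n - 1 agents
    alpha1       -- P[X > 0 | S = 1]
    alpha0       -- P[X > 0 | S = 0]
    """
    n_i = sum(others_votes)         # N_i: number of other agents voting 1
    n_minus_1 = len(others_votes)   # n - 1
    return (
        x_i
        + n_i * math.log(alpha1 / alpha0)
        + (n_minus_1 - n_i) * math.log((1.0 - alpha1) / (1.0 - alpha0))
    )


def second_round_vote(x_i, others_votes, alpha1, alpha0):
    """V_i(2) is the indicator of the event Y_i(2) > 0."""
    return int(second_round_llr(x_i, others_votes, alpha1, alpha0) > 0)


# Illustrative inputs: a mildly negative private signal outvoted by two peers.
y = second_round_llr(-1.0, [1, 1, 0], alpha1=0.8, alpha0=0.2)
print(y, second_round_vote(-1.0, [1, 1, 0], alpha1=0.8, alpha0=0.2))
```

Each agent needs only its own $X_i$ and the count of first-round votes for 1, so the computation is linear in the number of agents.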
Denote 

$$W_i = N_i \log \frac{\alpha_1}{\alpha_0} + (n - 1 - N_i) \log \frac{1 - \alpha_1}{1 - \alpha_0},$$

so that $Y_i(2) = X_i + W_i$. Since $\mathbb{E}[N_i \mid S = 1] = (n - 1)\alpha_1$, conditioning $W_i$ on $S = 1$ we get 
that 

$$\mathbb{E}[W_i \mid S = 1] = (n - 1)\alpha_1 \log \frac{\alpha_1}{\alpha_0} + (n - 1)(1 - \alpha_1) \log \frac{1 - \alpha_1}{1 - \alpha_0}.$$

If we denote $(n - 1)D = \mathbb{E}[W_i \mid S = 1]$ then $D$ is the Kullback-Leibler divergence [8] of two Bernoulli 
distributions with expectations $\alpha_1 \neq \alpha_0$. Hence $D > 0$. 
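
The identity $\mathbb{E}[W_i \mid S = 1] = (n-1)D$ can be verified numerically. The sketch below again assumes, purely for illustration, the Gaussian signal model in which $A$ is $N(+1, 1)$ under $S = 1$ and $N(-1, 1)$ under $S = 0$ (so $X = 2A$); this model and the helper names are not part of the text.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed with the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


# Under the assumed Gaussian model, X > 0 exactly when A > 0.
alpha1 = 1.0 - normal_cdf(-1.0)  # P[X > 0 | S = 1], with A ~ N(+1, 1)
alpha0 = 1.0 - normal_cdf(1.0)   # P[X > 0 | S = 0], with A ~ N(-1, 1)


def bernoulli_kl(p, q):
    """Kullback-Leibler divergence of Bernoulli(p) from Bernoulli(q)."""
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def expected_w(n):
    """E[W_i | S = 1], summed over the Binomial(n - 1, alpha1) law of N_i."""
    total = 0.0
    for k in range(n):  # k ranges over the possible values 0..n-1 of N_i
        pmf = math.comb(n - 1, k) * alpha1**k * (1.0 - alpha1) ** (n - 1 - k)
        w = k * math.log(alpha1 / alpha0) + (n - 1 - k) * math.log(
            (1.0 - alpha1) / (1.0 - alpha0)
        )
        total += pmf * w
    return total


D = bernoulli_kl(alpha1, alpha0)
print("D =", D)
print("E[W_i | S = 1] for n = 20:", expected_w(20), "vs (n - 1) D:", 19 * D)
```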

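Now, conditioned on $S$ the private signals are independent, and hence so are the votes at round 
1, since $V_i(1)$ depends on $A_i$ only. Therefore $W_i$, conditioned on $S = 1$, is the sum of $n - 1$ bounded 
independent random variables. Therefore by the Hoeffding bound there exists a constant $C_1$ such 
that 

$$\mathbb{P}\left[W_i < \tfrac{1}{2}(n - 1)D \mid S = 1\right] < e^{-C_1 (n - 1)}. \tag{25}$$

By Claim 2.11 we have that $\mathbb{P}\left[X_i < -\tfrac{1}{2}(n - 1)D \mid S = 1\right] < e^{-\frac{1}{2} D (n - 1)}$. The event $V_i(2) = 0$ requires 
$X_i + W_i \leq 0$, and hence $X_i \leq -\tfrac{1}{2}(n - 1)D$ or $W_i < \tfrac{1}{2}(n - 1)D$. Since $X_i$ is non-atomic, Claim 2.11 together with 
Eq. (25) and the union bound yields 

$$\mathbb{P}[V_i(2) = 0 \mid S = 1] < e^{-\frac{1}{2} D (n - 1)} + e^{-C_1 (n - 1)}.$$

Therefore, again using the union bound, it follows that 

$$\mathbb{P}[\exists i : V_i(2) = 0 \mid S = 1] < n \left( e^{-\frac{1}{2} D (n - 1)} + e^{-C_1 (n - 1)} \right).$$

Finally, it follows that there exists a constant $C$ such that, for all $n$ large enough, 

$$\mathbb{P}[\forall i : V_i(2) = 1 \mid S = 1] > 1 - e^{-Cn}.$$

□ 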
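
To see the exponential decay concretely, the second-round error probability can be computed exactly in a specific model. The sketch below assumes the same illustrative Gaussian signals as before ($A \sim N(+1, 1)$ under $S = 1$, $A \sim N(-1, 1)$ under $S = 0$, $X = 2A$); the model and the function name `second_round_error` are assumptions, not part of the theorem.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed with the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


ALPHA1 = normal_cdf(1.0)   # P[X > 0 | S = 1] in the assumed Gaussian model
ALPHA0 = normal_cdf(-1.0)  # P[X > 0 | S = 0] in the assumed Gaussian model


def second_round_error(n):
    """Exact P[V_i(2) = 0 | S = 1] for a single agent among n."""
    log_up = math.log(ALPHA1 / ALPHA0)
    log_down = math.log((1.0 - ALPHA1) / (1.0 - ALPHA0))
    err = 0.0
    for k in range(n):  # k = N_i, the number of other agents voting 1
        pmf = math.comb(n - 1, k) * ALPHA1**k * (1.0 - ALPHA1) ** (n - 1 - k)
        w = k * log_up + (n - 1 - k) * log_down
        # V_i(2) = 0 iff X_i + w <= 0 iff A_i <= -w / 2, where A_i ~ N(1, 1).
        err += pmf * normal_cdf(-w / 2.0 - 1.0)
    return err


# Union bound over agents: P[some agent votes 0 at round 2 | S = 1] <= n * err.
for n in (5, 10, 20, 40):
    print(n, n * second_round_error(n))
```

In this assumed model the union bound $n \cdot \mathbb{P}[V_i(2) = 0 \mid S = 1]$ shrinks rapidly as $n$ grows, in line with the $e^{-Cn}$ rate of Theorem 1.6.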